Dyna-Q Algorithm


Data-driven inventory management for new products: A warm-start and adjusted Dyna-Q approach

Qu, Xinye, Liu, Longxiao, Huang, Wenjie

arXiv.org Artificial Intelligence

In this paper, we propose a novel reinforcement learning algorithm for inventory management of newly launched products with no historical demand information. The algorithm follows the classic Dyna-Q structure, balancing model-free and model-based approaches, while accelerating the training process of Dyna-Q and mitigating the model discrepancy introduced by the model-based feedback. Based on the idea of transfer learning, warm-start information from the demand data of existing similar products can be incorporated into the algorithm to further stabilize early-stage training and reduce the variance of the estimated optimal policy. Our approach is validated through a case study of bakery inventory management with real data. The adjusted Dyna-Q shows up to a 23.7% reduction in average daily cost compared with Q-learning, and up to a 77.5% reduction in training time within the same horizon compared with classic Dyna-Q. With transfer learning, the adjusted Dyna-Q achieves the lowest total cost, the lowest variance in total cost, and relatively low shortage percentages among all the benchmarked algorithms over a 30-day test.

I. INTRODUCTION

Inventory management is crucial for supply chain operations, overseeing and controlling the ordering, storage, and usage of goods in a business [1]. In inventory management, the cold-start setting refers to predicting demand and formulating appropriate inventory strategies when new products are introduced or new market demands arise, due to the lack of historical data [2].
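The classic Dyna-Q structure the paper builds on can be sketched as follows. This is a minimal tabular illustration, not the authors' adjusted algorithm or their inventory setup: the `env_step` interface, the chain environment used in testing, and all hyperparameters here are assumptions for the sake of the example.

```python
import random
from collections import defaultdict

def dyna_q(env_step, actions, episodes=50, planning_steps=10,
           alpha=0.1, gamma=0.95, epsilon=0.1, start_state=0):
    """Tabular Dyna-Q: direct (model-free) Q-learning updates from real
    experience, plus simulated (model-based) updates from a learned
    deterministic model of the environment."""
    Q = defaultdict(float)   # (state, action) -> estimated value
    model = {}               # (state, action) -> (reward, next_state)

    def greedy(s):
        return max(actions, key=lambda a: Q[(s, a)])

    for _ in range(episodes):
        s, done = start_state, False
        while not done:
            # epsilon-greedy action selection
            a = random.choice(actions) if random.random() < epsilon else greedy(s)
            r, s2, done = env_step(s, a)            # real experience
            # model-free update from the real transition
            Q[(s, a)] += alpha * (r + gamma * Q[(s2, greedy(s2))] - Q[(s, a)])
            model[(s, a)] = (r, s2)                 # model learning
            # model-based planning: replay simulated transitions
            for _ in range(planning_steps):
                ps, pa = random.choice(list(model))
                pr, ps2 = model[(ps, pa)]
                Q[(ps, pa)] += alpha * (pr + gamma * Q[(ps2, greedy(ps2))] - Q[(ps, pa)])
            s = s2
    return Q
```

The `planning_steps` loop is what separates Dyna-Q from plain Q-learning: each real step is amplified by simulated updates drawn from the learned model, which is also where the model discrepancy the paper addresses can enter.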


A Look at Value-Based Decision-Time vs. Background Planning Methods Across Different Settings

Alver, Safa, Precup, Doina

arXiv.org Artificial Intelligence

In model-based reinforcement learning (RL), an agent can leverage a learned model to improve its behavior in several ways. Two of the prevalent ways to do this are through decision-time and background planning methods. In this study, we are interested in understanding how the value-based versions of these two planning methods compare against each other across different settings. Towards this goal, we first consider the simplest instantiations of value-based decision-time and background planning methods and provide theoretical results on which one will perform better in the regular RL and transfer learning settings. Then, we consider the modern instantiations of them and provide hypotheses on which one will perform better in the same settings. Finally, we perform illustrative experiments to validate these theoretical results and hypotheses. Overall, our findings suggest that even though value-based versions of the two planning methods perform on par in their simplest instantiations, the modern instantiations of value-based decision-time planning methods can perform on par or better than the modern instantiations of value-based background planning methods in both the regular RL and transfer learning settings.
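The distinction between the two planning styles can be contrasted in a minimal tabular sketch. These are simplified textbook-style forms, not the simplest or modern instantiations studied in the paper: decision-time planning uses the model at the moment an action is needed and leaves the value table untouched, while background planning improves the value table between decisions, for state-action pairs sampled anywhere.

```python
import random
from collections import defaultdict

def decision_time_action(s, Q, model, actions, gamma=0.95):
    """Decision-time planning: one-step lookahead through the learned
    model at the current state s; Q itself is not updated."""
    def lookahead(a):
        if (s, a) not in model:
            return Q[(s, a)]                      # fall back to cached value
        r, s2 = model[(s, a)]
        return r + gamma * max(Q[(s2, b)] for b in actions)
    return max(actions, key=lookahead)

def background_planning_step(Q, model, actions, alpha=0.1, gamma=0.95):
    """Background planning: one Dyna-style simulated update on a
    randomly chosen known state-action pair, independent of the
    agent's current state."""
    ps, pa = random.choice(list(model))
    pr, ps2 = model[(ps, pa)]
    target = pr + gamma * max(Q[(ps2, b)] for b in actions)
    Q[(ps, pa)] += alpha * (target - Q[(ps, pa)])
```

The asymmetry the paper analyzes is visible even here: decision-time planning concentrates model computation on the state the agent actually faces, while background planning spreads it across remembered experience.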